Simplify tests, shorten builddir name on windows #137
Conversation
I am not sure why CI is failing. When I run the tests locally they pass.
Force the tests to re-run (unsure how), and see if they still fail? If they do, force a re-run on the base (unsure how) and see if they still pass? If all that is true, then dig deeper.
The failure we get is genuine. I think you can get it too by running both …
When I run the tests locally the …
Maybe the CI runs pytest-xdist anyway? Or some other pytest version, or other plugins, etc.? Something that would make the order of execution different, e.g. …
I am stumped. No pytest-xdist plugin is used:
and the tests are run in the correct order:
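(The supporting output is not reproduced here.) For comparison, here is a hypothetical way to re-check both of those things locally; the exact invocation below is my sketch, not lifted from this project's CI:

```python
# Hypothetical local check, not taken from this PR's CI: print the
# collected test ids in collection order without executing anything,
# with pytest-xdist explicitly disabled ("-p no:xdist") in case it
# happens to be installed locally.
import pytest

pytest.main([
    "--collect-only", "-q", "-p", "no:xdist",
    "testing/cffi0/test_verify.py",
    "testing/cffi1/test_verify1.py",
])
```

The `plugins:` line in the session header of any normal pytest run also shows which plugins are active.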
When I can I will add debug cruft to …
Another strange thing I noticed in the codebase: there is both testing/cffi0/test_verify.py and testing/cffi1/test_verify1.py. These files echo each other: they have many of the same tests with slight variations. Some of these repeated tests are quite "expensive" on PyPy's nightly testing on windows. For instance: …
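The concrete example is elided above. As a general illustration (hypothetical code, not from this repository), such duplicated expensive tests could carry a shared marker that slow CI targets deselect:

```python
# Hypothetical sketch, not from this repository: mark tests that are
# duplicated between testing/cffi0/test_verify.py and
# testing/cffi1/test_verify1.py so slow windows CI targets can skip
# them with `-m "not slow"`.
import pytest


@pytest.mark.slow  # "slow" would need registering in pytest.ini to avoid a warning
def test_some_expensive_verify_case():
    # stand-in body; the real tests call ffi.verify() with large sources
    pass
```

A slow windows target could then run `pytest -m "not slow" testing/`, while at least one full target keeps running everything.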
Still stumped. I used …
Compare: f40b90e to 4e7ebbd
CI was passing so I removed the debug cruft and enabled windows tests. Let's see how long it takes. I am not sure why, when passing a module name, I had to make it unique, but 🤷. I guess #135 avoided this problem by making all the module names unique.
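For illustration, here is a minimal self-contained sketch of the unique-module-name idea; the naming scheme and the `cos` example are my own, not this PR's code:

```python
# Hypothetical sketch: give each verify() call a unique module name so a
# stale cached build from an earlier test run cannot be picked up by mistake.
import math
import uuid

import cffi

ffi = cffi.FFI()
ffi.cdef("double cos(double);")
modulename = "test_verify_" + uuid.uuid4().hex[:8]  # made-up naming scheme
lib = ffi.verify('#include <math.h>',
                 libraries=["m"],  # ["m"] on linux; typically empty on windows
                 modulename=modulename)
assert lib.cos(1.23) == math.cos(1.23)
```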
25 minutes for the windows tests, I guess that is acceptable? It could be less if the repetitive tests were removed from cffi0/test_verify.py and cffi1/test_verify1.py.
cffi0 checks the verify() call that was in cffi 0.x.y, while cffi1 checks the new compile() call. These are very different pieces of code, even if they both implement a large common part.
Ahh, makes sense, thanks for clarifying. CI is passing, including the newly enabled windows tests.
Rebased on main and cleaned up history. CC @ngoldbaum. I would consider this ready to merge if CI passes, the tests on windows do not take too long, and all the tests (except embedding on windows) run. We should maybe open an issue to revert disabling the embedding tests on windows (commit 8be5d8a), although if I understand correctly they were not run in CI before this commit anyway. I am not sure about the changes in the GitHub CI YAML: each platform has a different … and windows uses …
CI test comparison:
So windows is still ~8x slower than linux while running fewer tests 🤷
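As an aside, stock pytest can show where those minutes go; the `--durations` flag is standard pytest, not something added by this PR:

```python
# Standard pytest feature, shown here only as a suggestion: report the
# 25 slowest test phases so the duplicated expensive tests stand out.
import pytest

pytest.main(["--durations=25", "testing/"])
```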
assert not os.path.exists(cfile)  # no generated C source exists yet
lib = ffi.verify('#include <math.h>', libraries=lib_m, modulename=modulename)
assert lib.cos(1.23) == math.cos(1.23)  # the loaded module works
assert not os.path.exists(cfile)  # verify() reused the cached module, no C file was regenerated
I wonder if setting … I personally find the linux version the most readable. The pip upgrade should happen elsewhere, probably on Mac. It looks like …
I wonder what the original intent of … is.
Just to allow individual Windows matrix entries to override the test command. I'll sometimes enable a larger subset of tests for a small number of targets to verify things that are too slow to do for all Windows targets, especially since we often can't run all targets in parallel. I'll begrudgingly deal with a couple of 20-40 min test runs, but I don't want to sit around waiting for 20 of them running serially 😆. Running full tests on emulated arches is currently the largest contributor to the overall build time, which is usually why I'll end up turning those back down to just some minimal smoke tests against the built artifact. If it was just those, I wouldn't complain, but they tend to clog up the works because they'll eventually all run at the same time and block shorter targets from running. We could probably get fancy with concurrency groups to try and minimize the total runtime, but that's been a pretty low priority; I usually settle on something like at least one full test run for each platform/arch for release builds. Doing full tests on every Python/platform/arch would be nice, but there's definitely diminishing returns, especially with emulation or slow targets.
Ahh, cool, so let’s leave that part alone.
Taken from #135, without the pytest-xdist module name randomization
This should be smaller and easier to review.